AITopics | Albany

Large Language Models exhibit impressive reasoning capabilities across diverse tasks, motivating efforts to distill these capabilities into smaller models through generated reasoning data. However, direct training on such synthesized reasoning data may lead to superficial imitation of reasoning process, rather than fostering a genuine integration of reasoning capabilities with underlying knowledge. To address this, we propose TinyThinker, a framework introducing two novel approaches. First, we introduce a three-stage process that incrementally guides the student model through the reasoning process, progressively refining knowledge from coarse to fine granularity. Second, we develop a two-phase training framework comprising an initial reasoning acquisition phase followed by a self-reflection phase utilizing self-generated data. Experiments on commonsense reasoning benchmarks demonstrate that TinyThinker achieves superior performance compared to baselines. Ablation studies further validate the effectiveness of each component in our framework. TinyThinker is extendable to other knowledge-intensive reasoning tasks, offering an alternative strategy for developing effective reasoning capabilities in smaller language models. Codes are available at https://github.com/shengminp/TinyThinker

artificial intelligence, large language model, natural language, (14 more...)

arXiv.org Artificial Intelligence

2412.08024

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > Canada > Ontario > Toronto (0.05)
Asia > Singapore (0.04)
(8 more...)

Genre: Research Report (1.00)

Industry: Education (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (1.00)

Add feedback

Dynamic graph neural networks for enhanced volatility prediction in financial markets

Kumar, Pulikandala Nithish, Umeorah, Nneka, Alochukwu, Alex

arXiv.org Artificial IntelligenceOct-22-2024

Volatility forecasting is essential for risk management and decision-making in financial markets. Traditional models like Generalized Autoregressive Conditional Heteroskedasticity (GARCH) effectively capture volatility clustering but often fail to model complex, non-linear interdependencies between multiple indices. This paper proposes a novel approach using Graph Neural Networks (GNNs) to represent global financial markets as dynamic graphs. The Temporal Graph Attention Network (Temporal GAT) combines Graph Convolutional Networks (GCNs) and Graph Attention Networks (GATs) to capture the temporal and structural dynamics of volatility spillovers. By utilizing correlation-based and volatility spillover indices, the Temporal GAT constructs directed graphs that enhance the accuracy of volatility predictions. Empirical results from a 15-year study of eight major global indices show that the Temporal GAT outperforms traditional GARCH models and other machine learning methods, particularly in short- to mid-term forecasts. The sensitivity and scenario-based analysis over a range of parameters and hyperparameters further demonstrate the significance of the proposed technique. Hence, this work highlights the potential of GNNs in modeling complex market behaviors, providing valuable insights for financial analysts and investors.

artificial intelligence, machine learning, volatility, (17 more...)

arXiv.org Artificial Intelligence

2410.16858

Country:

Europe > United Kingdom (0.28)
Asia > South Korea (0.14)
North America > Trinidad and Tobago > Trinidad > Arima > Arima (0.04)
(6 more...)

Genre:

Research Report > New Finding (0.46)
Research Report > Promising Solution (0.34)
Overview > Innovation (0.34)

Industry: Banking & Finance > Trading (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Dubo-SQL: Diverse Retrieval-Augmented Generation and Fine Tuning for Text-to-SQL

Thorpe, Dayton G., Duberstein, Andrew J., Kinsey, Ian A.

arXiv.org Artificial IntelligenceApr-18-2024

The current state-of-the-art (SOTA) for automated text-to-SQL still falls well short of expert human performance as measured by execution accuracy (EX) on the BIRD-SQL benchmark. The most accurate methods are also slow and expensive. To advance the SOTA for text-to-SQL while reducing cost and improving speed, we explore the combination of low-cost fine tuning, novel methods for diverse retrieval-augmented generation (RAG) and new input and output formats that help large language models (LLMs) achieve higher EX. We introduce two new methods, Dubo-SQL v1 and v2. Dubo-SQL v1 sets a new record for EX on the holdout test set of BIRD-SQL. Dubo-SQL v2 achieves even higher performance on the BIRD-SQL dev set. Dubo-SQL v1 relies on LLMs from OpenAI, but uses the low-cost GPT-3.5 Turbo while exceeding the performance of the next-best model using OpenAI, which instead uses the more expensive GPT-4. Dubo-SQL v1 exceeds the performance of the next-best model using GPT-3.5 by over 20%. Dubo-SQL v2 uses GPT-4 Turbo and RAG in place of fine tuning to push EX higher.

database, execution accuracy, few-shot example, (13 more...)

arXiv.org Artificial Intelligence

2404.1256

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > United States > California > San Francisco County > San Francisco (0.14)
North America > United States > Ohio > Summit County > Akron (0.04)
(6 more...)

Genre: Research Report (0.84)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.46)

Add feedback

Conceptual and Unbiased Reasoning in Language Models

Zhou, Ben, Zhang, Hongming, Chen, Sihao, Yu, Dian, Wang, Hongwei, Peng, Baolin, Roth, Dan, Yu, Dong

arXiv.org Artificial IntelligenceMar-29-2024

Conceptual reasoning, the ability to reason in abstract and high-level perspectives, is key to generalization in human cognition. However, limited study has been done on large language models' capability to perform conceptual reasoning. In this work, we bridge this gap and propose a novel conceptualization framework that forces models to perform conceptual reasoning on abstract questions and generate solutions in a verifiable symbolic space. Using this framework as an analytical tool, we show that existing large language models fall short on conceptual reasoning, dropping 9% to 28% on various benchmarks compared to direct inference methods. We then discuss how models can improve since high-level abstract reasoning is key to unbiased and generalizable decision-making. We propose two techniques to add trustworthy induction signals by generating familiar questions with similar underlying reasoning paths and asking models to perform self-refinement. Experiments show that our proposed techniques improve models' conceptual reasoning performance by 8% to 11%, achieving a more robust reasoning system that relies less on inductive biases.

original question, reasoning, str, (16 more...)

arXiv.org Artificial Intelligence

2404.00205

Country:

Pacific Ocean > North Pacific Ocean > Sea of Japan (0.05)
Asia > Japan (0.05)
Asia > China > Hong Kong (0.05)
(9 more...)

Genre:

Research Report (0.64)
Personal > Honors (0.47)

Industry:

Leisure & Entertainment (1.00)
Education (0.93)
Media > Music (0.70)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (1.00)

Add feedback

Towards Uncertainty-Aware Language Agent

Han, Jiuzhou, Buntine, Wray, Shareghi, Ehsan

arXiv.org Artificial IntelligenceFeb-7-2024

While Language Agents have achieved promising success by placing Large Language Models at the core of a more versatile design that dynamically interacts with the external world, the existing approaches neglect the notion of uncertainty during these interactions. We present the Uncertainty-Aware Language Agent (UALA), a framework that orchestrates the interaction between the agent and the external world using uncertainty quantification. Compared with other well-known counterparts like ReAct, our extensive experiments across 3 representative tasks (HotpotQA, StrategyQA, MMLU) and various LLM sizes demonstrate that UALA brings a significant improvement of performance, while having a substantially lower reliance on the external world (i.e., reduced number of tool calls and tokens). Our analyses provide various insights including the great potential of UALA compared with agent fine-tuning, and underscore the unreliability of verbalised confidence of LLMs as a proxy for uncertainty.

calibration, estimation, hotpotqa, (15 more...)

arXiv.org Artificial Intelligence

2401.14016

Country:

North America > United States > Colorado (0.05)
North America > United States > New York > Albany County > Albany (0.04)
North America > United States > Georgia > Dougherty County > Albany (0.04)
(12 more...)

Genre:

Research Report > New Finding (0.67)
Research Report > Experimental Study (0.46)

Industry:

Media > Film (1.00)
Leisure & Entertainment (1.00)
Government > Regional Government > North America Government > United States Government (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.72)

Add feedback

A Chain-of-Thought Is as Strong as Its Weakest Link: A Benchmark for Verifiers of Reasoning Chains

Jacovi, Alon, Bitton, Yonatan, Bohnet, Bernd, Herzig, Jonathan, Honovich, Or, Tseng, Michael, Collins, Michael, Aharoni, Roee, Geva, Mor

arXiv.org Artificial IntelligenceFeb-2-2024

Prompting language models to provide step-by-step answers (e.g., "Chain-of-Thought") is the prominent approach for complex reasoning tasks, where more accurate reasoning chains typically improve downstream task performance. Recent literature discusses automatic methods to verify reasoning steps to evaluate and improve their correctness. However, no fine-grained step-level datasets are available to enable thorough evaluation of such verification methods, hindering progress in this direction. We introduce Reveal: Reasoning Verification Evaluation, a new dataset to benchmark automatic verifiers of complex Chain-of-Thought reasoning in open-domain question answering settings. Reveal includes comprehensive labels for the relevance, attribution to evidence passages, and logical correctness of each reasoning step in a language model's answer, across a wide variety of datasets and state-of-the-art language models.

correctness, dataset, inference, (14 more...)

arXiv.org Artificial Intelligence

2402.00559

Country:

North America > United States > Georgia > Dougherty County > Albany (0.14)
North America > United States > Texas (0.05)
North America > United States > California (0.05)
(12 more...)

Genre: Workflow (1.00)

Industry:

Health & Medicine (0.68)
Leisure & Entertainment > Sports > Soccer (0.67)
Leisure & Entertainment > Sports > Basketball (0.67)

Technology:

Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (0.68)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Add feedback

Can Language Models Teach Weaker Agents? Teacher Explanations Improve Students via Personalization

Saha, Swarnadeep, Hase, Peter, Bansal, Mohit

arXiv.org Artificial IntelligenceNov-14-2023

A hallmark property of explainable AI models is the ability to teach other agents, communicating knowledge of how to perform a task. While Large Language Models perform complex reasoning by generating explanations for their predictions, it is unclear whether they also make good teachers for weaker agents. To address this, we consider a student-teacher framework between two LLM agents and study if, when, and how the teacher should intervene with natural language explanations to improve the student's performance. Since communication is expensive, we define a budget such that the teacher only communicates explanations for a fraction of the data, after which the student should perform well on its own. We decompose the teaching problem along four axes: (1) if teacher's test time intervention improve student predictions, (2) when it is worth explaining a data point, (3) how the teacher should personalize explanations to better teach the student, and (4) if teacher explanations also improve students on future unexplained data. We first show that teacher LLMs can indeed intervene on student reasoning to improve their performance. Next, inspired by the Theory of Mind abilities of effective teachers, we propose building two few-shot mental models of the student. The first model defines an Intervention Function that simulates the utility of an intervention, allowing the teacher to intervene when this utility is the highest and improving student performance at lower budgets. The second model enables the teacher to personalize explanations for a particular student and outperform unpersonalized teachers. We also demonstrate that in multi-turn interactions, teacher explanations generalize and learning from explained data improves student performance on future unexplained data. Finally, we verify that misaligned teachers can lower student performance to random chance by intentionally misleading them.

explanation, intervention, student, (14 more...)

arXiv.org Artificial Intelligence

2306.09299

Country:

North America > United States > Georgia > Dougherty County > Albany (0.14)
North America > United States > New York (0.04)
North America > United States > Pennsylvania (0.04)
(3 more...)

Genre: Research Report > New Finding (1.00)

Industry:

Education > Assessment & Standards > Student Performance (0.75)
Government > Regional Government > North America Government > United States Government (0.67)
Health & Medicine > Therapeutic Area > Obstetrics/Gynecology (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.93)
Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (0.68)

Add feedback

Beyond Words: A Mathematical Framework for Interpreting Large Language Models

González, Javier, Nori, Aditya V.

arXiv.org Artificial IntelligenceNov-6-2023

Large language models (LLMs) are powerful AI tools that can generate and comprehend natural language text and other complex information. However, the field lacks a mathematical framework to systematically describe, compare and improve LLMs. We propose Hex a framework that clarifies key terms and concepts in LLM research, such as hallucinations, alignment, self-verification and chain-of-thought reasoning. The Hex framework offers a precise and consistent way to characterize LLMs, identify their strengths and weaknesses, and integrate new findings. Using Hex, we differentiate chain-of-thought reasoning from chain-of-thought prompting and establish the conditions under which they are equivalent. This distinction clarifies the basic assumptions behind chain-of-thought prompting and its implications for methods that use it, such as self-verification and prompt programming. Our goal is to provide a formal framework for LLMs that can help both researchers and practitioners explore new possibilities for generative AI. We do not claim to have a definitive solution, but rather a tool for opening up new research avenues. We argue that our formal definitions and results are crucial for advancing the discussion on how to build generative AI systems that are safe, reliable, fair and robust, especially in domains like healthcare and software engineering.

diagram, llm, reasoning, (16 more...)

arXiv.org Artificial Intelligence

2311.03033

Country:

Europe > United Kingdom (0.14)
North America > United States > Illinois > Cook County > Chicago (0.07)
Europe > France (0.04)
(4 more...)

Genre: Research Report > New Finding (0.34)

Industry:

Health & Medicine (0.48)
Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.55)

Add feedback